Add new transcoding extension #168

Tolriq · 2025-07-15T09:02:38Z

New extension to have proper transcoding solution in OpenAPI.

This is a WIP to start the many discussions that this will bring.

netlify · 2025-07-15T09:02:42Z

✅ Deploy Preview for opensubsonic ready!

Name	Link
🔨 Latest commit	`7c33949`
🔍 Latest deploy log	https://app.netlify.com/projects/opensubsonic/deploys/692caad1f713810008a01876
😎 Deploy Preview	https://deploy-preview-168--opensubsonic.netlify.app

kgarner7

Only reviewed the markdown. General thoughts:

It's extremely important to describe the behavior of what happens when you specify multiple transcoding profiles. Which one will the server return (it can only return one).
I'm a bit worried that the limitations would be a bit too heavy for a server. Not entirely sure.
Some parts (codec, container) could probably be made more explicit on the format

All things said, I do think this is a good first pass. I'm worried that it's a bit heavy for the server (especially if a large number of profiles/codecs/limitations is provided), but it should be good especially for mobile clients.

content/en/docs/Endpoints/getTranscodeDecision.md

content/en/docs/Endpoints/getTranscodeStream.md

content/en/docs/Endpoints/getTranscodeDecision.md

content/en/docs/Responses/transcodeDecisionResponse.md

content/en/docs/Responses/DirectPlayProfile.md

content/en/docs/Payloads/ClientInfo.md

content/en/docs/Payloads/Limitation.md

Tolriq · 2025-07-20T07:21:59Z

> It's extremely important to describe the behavior of what happens when you specify multiple transcoding profiles. Which one will the server return (it can only return one).

The transcoding profiles are listed in order of preference: the server returns the first one it can satisfy, and the details are present in the transcode decision response.

> I'm a bit worried that the limitations would be a bit too heavy for a server. Not entirely sure.

Limitations are the core of the decision process. Plex, Emby and Jellyfin all work with the same concepts, and they are relatively easy to manage server side.

> Some parts (codec, container) could probably be made more explicit on the format

There was some discussion some time ago about returning details about the media, and we did not reach a consensus.

I agree a consensus would be better, as it would also be used for the details returned for the tracks. Let's hope people can agree on something.

lachlan-00 · 2025-07-23T02:56:59Z

This seems really complicated.

Is this more helpful for video servers like jellyfin where you can select various stream outputs from a list of available options?

Tolriq · 2025-07-23T05:09:31Z

This is not complicated ;)
This is necessary to have proper transcoding support for proper audio quality.

There are a lot of details in the discussion part.

Clients should have control over what quality they want. If I can't play some format like DSD, I do not want to receive low quality MP3, I want hi res FLAC.

When I cast to a Sonos device that does not support my FLAC 24/96, I want to receive FLAC 24/48 and not low quality MP3 or Opus.

Same when casting to chromecast and a million other cases.

All major media providers have such an API and this is the main missing part of Subsonic for audiophiles and casting.

lachlan-00 · 2025-07-25T00:53:33Z

that explains why i'm not following it well, i listen to whatever source i'm given. but that makes sense now

Tolriq · 2025-09-05T06:24:37Z

@opensubsonic/servers So holidays are now mostly over :) Any objections or remarks on the draft would be best raised now, before the final polish, to avoid having to drop / rewrite everything.

sentriz · 2025-09-10T20:15:57Z

+1 on the complicated topic. I don't understand why we couldn't get by with extending the current stream.view?format= param

this param has been around for years, but it's been ambiguous for servers outside the original subsonic server

but I see it as just a key for the format the client is requesting, if we had something like

getFormats.view which returned say

{
"format_1": {"codec": ..., "bitrate": ..., "sampleRate": ..., "mime": ..., "etc": ...},
"format_2": {"codec": ..., "bitrate": ..., "sampleRate": ..., "mime": ..., "etc": ...},
"format_3": {"codec": ..., "bitrate": ..., "sampleRate": ..., "mime": ..., "etc": ...}
}

these represent the possible formats a server can transcode to

the client can choose to use a format or not on its own without the server's knowledge

the song's original bitrate/sampleRate/etc is already known from the Child response.

so the client sees there is a transcode option resulting in a bitrate < the song's bitrate, and in a codec it can play. it can choose a format, format_1 for example

then request it with stream.view?format=format_1

this is also backwards compatible with the original subsonic server and only a small extension

so i wonder, which problem does this solution fail to address?
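
A minimal sketch of the client-side selection described in this comment, assuming a hypothetical getFormats.view payload. The format keys, the field names and the `pick_format` helper are illustrative and not part of any existing API. Picking the highest qualifying bitrate is also an assumption, since the comment only says the client can choose a format.

```python
from typing import Dict, Optional, Set

# Hypothetical getFormats.view payload; keys and fields are assumptions.
FORMATS: Dict[str, dict] = {
    "format_1": {"codec": "opus", "bitrate": 128},
    "format_2": {"codec": "mp3", "bitrate": 320},
    "format_3": {"codec": "flac", "bitrate": 900},
}


def pick_format(formats: Dict[str, dict], song_bitrate: int, playable: Set[str]) -> Optional[str]:
    """Return the key of the highest-bitrate format whose codec the client
    can play and whose bitrate is below the song's bitrate, or None."""
    candidates = [
        (fmt["bitrate"], key)
        for key, fmt in formats.items()
        if fmt["codec"] in playable and fmt["bitrate"] < song_bitrate
    ]
    if not candidates:
        return None
    # Tuples sort by bitrate first, so max() yields the best candidate.
    return max(candidates)[1]


choice = pick_format(FORMATS, song_bitrate=1000, playable={"opus", "mp3"})
print(choice)  # format_2
```

The returned key would then be passed as the `format` parameter of the stream request.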

sentriz · 2025-09-10T20:24:56Z

also as a side, how do these changes interact with the transcodedContentType and transcodedSuffix which clients use?

lachlan-00 · 2025-09-10T23:06:34Z

> also as a side, how do these changes interact with the transcodedContentType and transcodedSuffix which clients use?

I've always treated these as the default output when running stream/download without additional options. A user-selectable format wouldn't affect the default outputs in that case.

Tolriq · 2025-09-11T06:02:26Z

I've already given a dozen examples of the need and the reasons for it, and all the other media providers offer such an API because it is necessary to address a lot of cases.

Transcoding is not about a short predefined list on the server, it's about having control over the output to get the best result for the user.

Again, if I want to cast my hi res FLAC 24/96 to my Sonos device, I want hi res FLAC 24/48 to have the best sound. I do not want to have to choose between mp3 and opus because those are the only 2 default values the server has.

I also do not want my DSD files transcoded to mp3, or to have to force a bitrate. I want a format that I support.

xHE AAC, ... and so many different needs depending on the player and the cast target. When I cast to the phone I want 2 channels, when I cast to my high-end AVR I want to keep the 6 channels.

Each device, UPnP renderer, Chromecast, ... will have a unique list of supported combinations of parameters. This can't be handled with 3 predefined profiles on the server.

And the details from Child are not precise enough: mime and suffix are more about the container than actual detailed codec information.

> also as a side, how do these changes interact with the transcodedContentType and transcodedSuffix which clients use?

This does not change the fact that they are random values, as servers already support multiple profiles ;) Most servers report them as the default transcoded result used when the user does not request a transcode but the server forces it.

Something that is also an issue currently: if a server decides to transcode on its own due to its internal settings, users are not really aware of it, and we can't properly use the seek extension to seek in those transcodes.

TL;DR: The current solution is very limited. While it may fit some basic needs, it's not the proper solution that a mature streaming platform like OpenSubsonic needs to compete with the rest of the ecosystem.

sentriz · 2025-09-11T09:51:29Z

> I've already given a dozen examples of the need and the reasons for it, and all the other media providers offer such an API because it is necessary to address a lot of cases.
>
> Transcoding is not about a short predefined list on the server, it's about having control over the output to get the best result for the user.
>
> Again, if I want to cast my hi res FLAC 24/96 to my Sonos device, I want hi res FLAC 24/48 to have the best sound. I do not want to have to choose between mp3 and opus because those are the only 2 default values the server has.
>
> I also do not want my DSD files transcoded to mp3, or to have to force a bitrate. I want a format that I support.

I'm not talking about 3 formats. There could be 10s or 100s of them. All the possible codecs, sample rates, channels, bitdepths. The server has the control here to not show combinations of parameters which aren't possible or don't make sense.

So if you want FLAC 24/48, you choose that option. If that would be upsampling, you don't choose it

for example an incomplete list of formats:

{ "name": "flac_24_48k", "codec": "flac", "bitDepth": 24, "sampleRate": 48000 },
{ "name": "flac_16_44k", "codec": "flac", "bitDepth": 16, "sampleRate": 44100 },
{ "name": "opus_192",    "codec": "ogg",  "bitRate": 192 }, // lossy, no bit depth or sample rate
{ "name": "opus_128",    "codec": "ogg",  "bitRate": 128 }, // lossy, no bit depth or sample rate

Note how we don't show sampleRates and bitDepths for lossy formats. That's something the server needs to control

> xHE AAC, ... and so many different needs depending on the player and the cast target. When I cast to the phone I want 2 channels, when I cast to my high-end AVR I want to keep the 6 channels.
>
> Each device, UPnP renderer, Chromecast, ... will have a unique list of supported combinations of parameters. This can't be handled with 3 predefined profiles on the server.

This can still be supported, with the above stuff

This proposal has the benefit of actually being feasible to implement, for servers.

> And the details from Child are not precise enough: mime and suffix are more about the container than actual detailed codec information.

Then we can enhance this information, if it's not enough. And in a backwards compatible way. This info would be needed for this "format=" approach so that the client can correctly choose the format it wants by comparing these values to the getFormats values

Tolriq · 2025-09-11T11:42:04Z

> All the possible codecs, sample rates, channels, bitdepths.

This is not 10 or 100, this is multiple thousands of combinations: protocol, codecs, subCodec, containers, bitdepth, samplerate, channels. Without even talking about bitrate.

> This proposal has the benefit of actually being feasible to implement, for servers.

So if this proposal, which is actually implemented by Plex, Emby and Jellyfin, is not possible to implement, how did they do it?
This proposal is actually not hard to implement and if you don't think that you can do it then do not implement the extension.

I'm sorry, but listing thousands of combinations makes absolutely no sense. Either we implement a proper transcoding engine or we don't. But what you propose is not a solution to the need of the users and the clients.

If the server is able to automatically generate the list of the thousands of combinations, then it can easily implement this feature as proposed, in a proper way. If it's not capable and you need to enter them manually, then this server will not be able to fit the users' needs either.
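
To make the combination count in this comment concrete, here is a rough sketch that multiplies out a few parameter axes. The value sets are illustrative assumptions, not what any server actually supports, and many of the resulting tuples would be invalid pairings that a real server would have to filter out.

```python
from itertools import product

# Illustrative value sets only; real servers and encoders differ.
AXES = {
    "protocol": ["http", "hls"],
    "container": ["flac", "ogg", "mp4", "mp3"],
    "codec": ["flac", "opus", "aac", "mp3"],
    "bitdepth": [16, 24],
    "samplerate": [44100, 48000, 88200, 96000, 192000],
    "channels": [1, 2, 6],
}

# Cartesian product of every axis: one tuple per candidate "format".
combinations = list(product(*AXES.values()))
print(len(combinations))  # 2 * 4 * 4 * 2 * 5 * 3 = 960
```

Even with these small assumed sets the list approaches a thousand entries before bitrate is considered, and each extra bitrate step multiplies it again.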

gravelld · 2025-09-15T15:38:23Z

getTranscodeDecision is a little clumsy for a term. How about *[T|t]ranscodeDecision -> *[T|t]ranscodeStreams? i.e. getTranscodeStreams returns a transcodeStream object

In the case of Plex et al, is it implemented this way because there's an implied control over the client? i.e. is knowledge about the client embedded in the server side code that creates the decision? One example: if there are multiple competing equivalent codecs specified, say flac and alac, how is it decided which is returned? A client may want to override that decision if the alternatives are essentially equivalent to the decision strategy. If it's to do with ordering in the query, this needs documenting.

I guess I'm not clear on the sort of control that can be exerted by the client.

Tolriq · 2025-09-15T19:21:06Z

100% of the control is with the client: it gives a list of everything it can directly play and a list of wanted transcoding profiles IN ORDER. If the media fits a direct play profile, then the server says you can direct play. Otherwise it takes the transcoding profiles in order, finds the first one it can do, and returns the necessary data for it.
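
The flow described here can be sketched roughly as follows. This is not the extension's actual schema: the `Media` and `Profile` types, the `decide` function, the result keys and the `server_codecs` set are hypothetical simplifications that ignore limitations, protocols and bitrates.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Media:
    container: str
    codec: str
    channels: int


@dataclass
class Profile:
    container: str
    codec: str
    max_channels: int


def fits(media: Media, profile: Profile) -> bool:
    """True when the source file can be played as-is under this profile."""
    return (
        media.container == profile.container
        and media.codec == profile.codec
        and media.channels <= profile.max_channels
    )


def decide(media: Media, direct_play: List[Profile],
           transcoding: List[Profile], server_codecs: Set[str]) -> dict:
    """Direct play if any profile fits, else the first transcoding profile
    (in client order) whose codec the server can encode."""
    if any(fits(media, p) for p in direct_play):
        return {"directPlay": True, "transcode": None}
    for profile in transcoding:  # list order is the client's preference
        if profile.codec in server_codecs:
            return {"directPlay": False, "transcode": profile}
    return {"directPlay": False, "transcode": None}


flac = Profile("flac", "flac", 2)
mp3 = Profile("mp3", "mp3", 2)
dsd_file = Media("dsf", "dsd", 2)
print(decide(dsd_file, [flac], [flac, mp3], {"flac", "mp3"}))
```

In this example the DSD file matches no direct play profile, so the first transcoding profile the server can produce (FLAC) is chosen ahead of MP3, which is the behaviour the comment asks for.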

The terms are related to the function. The first one asks for a decision that can contain a transcode, and the second is there to actually get the transcoded content, like stream.view; it does not return an object.

So IMO getTranscodeDecision is coherent with all the other get endpoints. The transcodeStream can be renamed to match the current stream; both make sense (like we have getLyrics or getCaptions to extract data from a track).

epoupon · 2025-10-27T20:47:25Z

In order to better understand this PR, I spent a couple of hours implementing it.
At first, I found it a bit overcomplicated, but it eventually made sense to me.
So I think it would really be a nice addition to the OS API, and it is not that complicated to implement on the server side.

Here’s what I noted:

Mismatch between songId in getTranscodeDecision and trackID in getTranscodeStream.
"maxAudioChannels" could possibly be shifted from DirectPlayProfile / TranscodingProfile to ClientInfo directly (seems to have the same usage as for the max bitrate).
Make it clear we don’t expect multiple values in the CodecProfile structs.
It looks like Jellyfin may mix up container names with file extensions; "opus" would be a valid container, considered to be "ogg". Not sure we want this.
For maxAudioBitrate and maxTranscodingAudioBitrate, I guess no value means no limit (should be written down)? Or make it mandatory but 0 means no limit?
Not sure about offset in the getTranscodeDecision endpoint since we can also set it in getTranscodeStream. Is the latter an offset to apply on top of the first one? An override? I’d just remove offset from getTranscodeDecision (it’s not part of the decision anyway).
"transcodeReasons" is an array, but it’s not clear which reason applies to which direct play profile or codec profile.
Looks like we also need AudioBitdepthNotSupported, which is currently missing.

Tolriq · 2025-10-28T08:29:37Z

> Mismatch between songId in getTranscodeDecision and trackID in getTranscodeStream.

Yes.

> "maxAudioChannels" could possibly be shifted from DirectPlayProfile / TranscodingProfile to ClientInfo directly (seems to have the same usage as for the max bitrate).

As explained, it's not at the top level for optimisation reasons during playback. The audio engine on the phone can convert 6 channels to stereo during playback, so you can support 6 channels in the direct play profiles. But converting directly to 2 channels when there is transcoding lowers CPU usage on the client: since there will already be some transcoding, it's better to have the server do the work than the phone.

> Make it clear we don’t expect multiple values in the CodecProfile structs.

Yes

> It looks like Jellyfin may mix up container names with file extensions; "opus" would be a valid container, considered to be "ogg". Not sure we want this.

The values are normally the ffmpeg values. If a client sends invalid or unknown values, they should just be ignored.

> For maxAudioBitrate and maxTranscodingAudioBitrate, I guess no value means no limit (should be written down)? Or make it mandatory but 0 means no limit?

0 means no limit as some clients will always encode fields. We can either make them mandatory or not as people prefer.
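
A small sketch of the "0 means no limit" convention discussed here. The helper name `within_bitrate_limit` and the choice to treat an omitted value like 0 are assumptions for illustration, and the unit is whatever the client and server agree on.

```python
def within_bitrate_limit(bitrate: int, max_bitrate: int = 0) -> bool:
    """Return True when the bitrate respects the limit.

    A limit of 0 (also the default when the field is omitted) means
    'no limit', so clients that always encode the field stay valid.
    """
    return max_bitrate == 0 or bitrate <= max_bitrate


print(within_bitrate_limit(1411))        # True, no limit given
print(within_bitrate_limit(1411, 0))     # True, explicit 0 means no limit
print(within_bitrate_limit(1411, 320))   # False, above the limit
```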

> Not sure about offset in the getTranscodeDecision endpoint since we can also set it in getTranscodeStream. Is the latter an offset to apply on top of the first one? An override? I’d just remove offset from getTranscodeDecision (it’s not part of the decision anyway).

Yes, it's a leftover from before moving to just a transcodeParams and not a full URL.

> "transcodeReasons" is an array, but it’s not clear which reason applies to which direct play profile or codec profile.
> Looks like we also need AudioBitdepthNotSupported, which is currently missing.

Yes, there are probably some errors missing. IMO a raw string is enough, as in all cases this will be more for the dev than for exposing nice messages to users.

epoupon

Thanks for the update!

content/en/docs/Endpoints/getTranscodeStream.md

content/en/docs/Responses/DirectPlayProfile.md

content/en/docs/Payloads/Limitation.md

content/en/docs/Responses/transcodeDecision.md

content/en/docs/Responses/TranscodingProfile.md

content/en/docs/Responses/transcodeDecision.md

openapi/endpoints/getTranscodeStream.json

Tolriq · 2025-11-08T19:09:14Z

@opensubsonic/servers @opensubsonic/clients The proposal is updated. It's present in an LMS build and proven to work, addressing this important missing part of OS.

Tolriq · 2025-11-29T20:06:04Z

And actually this also applies to the comma separated codecs and the * if we follow your logic against parsing of the strings.

If you prefer I can use comma separated values to match the rest of the proposal, else I'll ask all others to vote on the split versus changing everything.

kgarner7 · 2025-11-29T20:06:30Z

You can already have conflicting limitations by just specifying <= number and > number for the same field. If you absolutely want to eliminate the chance of conflicting limitations, then you would need to do something like the following, where each field is an optional single rule.

{
    "audioChannels": { "comparison": "LessThan", "value": 20 },
    "audioBitrate": {},
    "audioProfile": {},
    "audioSamplerate": {},
    "audioBitdepth": {}
}

Of course, if you want to support multiple separate limitations for the same value (e.g., one that is required and one that is not), then this alternate schema would prevent that.
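
As an illustration of the conflict described above (a `<=` rule and a `>` rule on the same field), here is a hedged sketch of how a server could detect it. The list-style limitation shape with a `name` key and every comparison name other than "LessThan" are assumptions, and values are treated as integers.

```python
import math
from typing import List, Tuple


def allowed_range(limitations: List[dict], name: str) -> Tuple[float, float]:
    """Collapse all numeric rules for one field into an inclusive range."""
    low, high = -math.inf, math.inf
    for lim in limitations:
        if lim["name"] != name:
            continue
        value, comparison = lim["value"], lim["comparison"]
        if comparison == "LessThan":
            high = min(high, value - 1)
        elif comparison == "LessThanEqual":
            high = min(high, value)
        elif comparison == "GreaterThan":
            low = max(low, value + 1)
        elif comparison == "GreaterThanEqual":
            low = max(low, value)
    return low, high


def is_conflicting(limitations: List[dict], name: str) -> bool:
    """True when no integer value can satisfy every rule for the field."""
    low, high = allowed_range(limitations, name)
    return low > high


rules = [
    {"name": "audioChannels", "comparison": "LessThanEqual", "value": 2},
    {"name": "audioChannels", "comparison": "GreaterThan", "value": 2},
]
print(is_conflicting(rules, "audioChannels"))  # True
```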

Tolriq · 2025-11-29T20:12:33Z

When everything is in the same list, we can apply the same simple rule as for the rest of the direct play and transcoding profiles: take the first one. Nothing complicated or fancy.

As for polymorphism, this is mostly to simplify the CLIENT side, so that clients do not have to write complex code to put the proper data in the proper field to generate the proper JSON at the end.

At some point we need to be logical about who will use the API and how. It's the clients that need to rebuild the profiles from their player support, and it's even worse when dealing with UPnP and the custom profiles that users will need to provide for their specific devices.

That's what matters: an API that is readable, that people can use, and that we can maintain without breaking changes.

kgarner7 · 2025-11-29T20:19:41Z

When all is in the same list, you still have problems of incompatible rules, because the same item can be specified multiple times regardless. At the end of the day, you are trying to encode a list for certain parameters. In JSON, there's a type for that: it's called an array.

At this point, I don't really care about the comparisons either way. I'd love for = and != to be formally expressed as an array in the schema, but /shrug/. That being said, at least for the other types that are always a list, you should make it so. And then, rather than say *, just make the list empty, since as I understand, the point of the container, audioCodec and protocol lists is to restrict the allowable fields.
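
The empty-list suggestion can be sketched as below. This only illustrates the idea in this comment and does not claim to describe what the final extension specifies; the `is_allowed` helper is hypothetical.

```python
from typing import List


def is_allowed(value: str, allowed: List[str]) -> bool:
    """An empty list means 'no restriction', replacing a '*' wildcard."""
    return not allowed or value in allowed


print(is_allowed("flac", []))               # True, nothing is restricted
print(is_allowed("flac", ["mp3", "opus"]))  # False, not in the list
```

With this shape the server never has to split a comma separated string or special-case a wildcard character.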

epoupon

Thanks a lot for all this work!

kgarner7 · 2025-11-30T15:06:01Z

Thanks for the changes. Please do also add

opensubsonic:
- Extension

for the endpoints for tracking

paulijar

With this one final typo fixed, the PR can be approved.

content/en/docs/Endpoints/getTranscodeDecision.md

Tolriq marked this pull request as draft July 15, 2025 09:03

kgarner7 previously requested changes Jul 20, 2025

epoupon mentioned this pull request Aug 15, 2025

Disk caching epoupon/lms#683

Closed

Tolriq force-pushed the transcoding branch from a7d1b00 to 08d54d5 Compare November 7, 2025 09:14

epoupon reviewed Nov 7, 2025

Tolriq force-pushed the transcoding branch from 08d54d5 to 251ef1c Compare November 8, 2025 10:12

epoupon reviewed Nov 8, 2025

openapi/endpoints/getTranscodeStream.json (outdated, resolved)

epoupon previously approved these changes Nov 8, 2025

Tolriq dismissed epoupon’s stale review via 41aa469 November 8, 2025 12:58

Tolriq force-pushed the transcoding branch from 251ef1c to 41aa469 Compare November 8, 2025 12:58

Tolriq marked this pull request as ready for review November 8, 2025 14:34

epoupon previously approved these changes Nov 8, 2025

Tolriq requested a review from kgarner7 November 8, 2025 19:06

Tolriq dismissed stale reviews from paulijar and epoupon via 156672c November 30, 2025 08:35

Tolriq force-pushed the transcoding branch 4 times, most recently from 282d966 to 381410e Compare November 30, 2025 10:13

epoupon previously approved these changes Nov 30, 2025

Tolriq dismissed epoupon’s stale review via 179d514 November 30, 2025 17:45

Tolriq force-pushed the transcoding branch from 381410e to 179d514 Compare November 30, 2025 17:45

kgarner7 previously approved these changes Nov 30, 2025

epoupon previously approved these changes Nov 30, 2025

Tolriq enabled auto-merge (squash) November 30, 2025 17:49

Tolriq requested a review from paulijar November 30, 2025 18:08

paulijar requested changes Nov 30, 2025

content/en/docs/Endpoints/getTranscodeDecision.md (outdated, resolved)

Add new transcoding extension

7c33949

New extension to have proper transcoding solution in OpenAPI.

Tolriq dismissed stale reviews from epoupon and kgarner7 via 7c33949 November 30, 2025 20:36

Tolriq force-pushed the transcoding branch from 179d514 to 7c33949 Compare November 30, 2025 20:36

paulijar approved these changes Nov 30, 2025

epoupon approved these changes Nov 30, 2025

kgarner7 approved these changes Nov 30, 2025

Tolriq disabled auto-merge December 1, 2025 07:20

Tolriq merged commit 04d932e into opensubsonic:main Dec 1, 2025
4 checks passed

epoupon mentioned this pull request Dec 2, 2025

New transcode Opensubsonic API epoupon/lms#787

Closed

sentriz mentioned this pull request Dec 5, 2025

[Feature Request] Mandate a final transcoding rule for * sentriz/gonic#633

Open
